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ABSTRACT 

In recent years, the number of pulsars with secure mass measurements has increased to a level 
that allows us to probe the underlying neutron star mass distribution in detail. We critically review 
radio pulsar mass measurements and present a detailed examination through which we are able to put 
stringent constraints on the underlying neutron star mass distribution. For the first time, we are able 
to analyze a sizable population of neutron star-white dwarf systems in addition to double neutron 
star systems with a technique that accounts for systematically different measurement errors. We find 
that neutron stars that have evolved through different evolutionary paths reflect distinctive signatures 
through dissimilar distribution peak and mass cutoff values. Neutron stars in double neutron star and 
neutron star-white dwarf systems show consistent respective peaks at 1.35 Mq and 1.50 Mq which 
suggest significant mass accretion (Am k 0.15 M Q ) has occurred during the spin up phase. The 
width of the mass distribution implied by double neutron star systems is indicative of a tight initial 
mass function while the inferred mass range is significantly wider for neutron stars that have gone 
through recycling. We find a mass cutoff at 2 M Q for neutron stars with white dwarf companions 
which establishes a firm lower bound for the maximum neutron star mass. This rules out the majority 
of strange quark and soft equation of state models as viable configurations for neutron star matter. 
The lack of truncation close to the maximum mass cutoff suggests that the 2M Q limit is set by 
evolutionary constraints rather than nuclear physics or general relativity, and the existence of rare 
super-massive neutron stars is possible. 

Subject headings: stars: neutron — pulsars: general — binaries: general — stars: fundamental pa- 
rameters — stars: statistics 



1. INTRODUCTION 

The mass of a neutron star (NS) has been a prime fo- 
cus of compact objects astrophysics since the discovery 
of neutrons. Soon after Chadwick's Letter on the "Pos- 
sible existence of a neutron" (1932), heated discussions 
around the world started to take place on the poten- 
tial implications of the discovery. In 1932, during one 
of these discussions in Copenhagen, Landau shared his 
views with Roscnfcld and Bohr where he anticipated the 
existence of a dense-compact star composed primarily of 
neutrons (e.g., Shapiro & Teukolsky 1983, p. 242). The 
prediction was not officially announced until Baade & 
Zwicky published their work where the phrase "neutron 
star" appeared in the literature for the first time (Baade 
& Zwicky 1934a). Their following work explained the 
possible evolutionary process leading to the production 
of a NS and the physics that simultaneously constrains 
the mass and radius in more detail (Baade & Zwicky 
1934b,c). 

The ensuing discussions were primarily focused on the 
mass of these dense objects. In 1931, Chandrasckhar 
had already published his original work in which he cal- 
culates the upper mass limit of an "ideal" white dwarf 
as 0.91 Mq, while the following year, Landau intuitively 
predicted that a limiting mass should exist close to 
1.5 Mq (Landau 1932). Following the works of Chan- 
drasekhar and Landau, and using the formalism devel- 
oped by Tolman, Oppcnhcimer & Volkoff predicted an 
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upper mass limit for NSs to be 0.7-3.4 M (Tolman 1939; 
Oppenheimer & Volkoff 1939). 

Since then, continuing discussions on the mass range 
a NS can attain have spawned a vast literature (e.g. 
Rhoadcs & Ruffini 1974; Joss & Rappaport 1976 
Thorsett & Chakrabarty 1999; Baumgarte et al. 2000 
Schwab et al. 2010, and references therein). 

Masses of NSs at birth are tuned by the intricate de- 
tails of the astrophysical processes that drive core col- 
lapse and supernova explosions (Timmes et al. 1996). 
The birth mass is therefore of particular interest to those 
who study these nuclear processes. An earlier attempt 
by Finn (1994) finds that NSs should predominantly fall 
in the 1.3-1.6 M Q mass range. The most comprehensive 
work to date by Thorsett & Chakrabarty (1999) finds 
that the mass distribution of observed pulsars are con- 
sistent with M = 1.38Tq'5q M©, a remarkably narrow 
mass range. The recent work of Schwab et al. (2010) on 
the other hand, argues that there is evidence for multi- 
modality in the NS birth mass distribution (for discus- 
sion, see §8). 

The maximum possible mass of a NS has attracted par- 
ticular attention because it delineates the low mass limit 
of stellar mass black holes (Rhoades & Ruffini 1974; Fryer 
& Kalogcra 2001). When combined with measurements 
of NS radii, it also provides a distinctive insight into the 
structure of matter at supranuclear densities (Cook et al. 
1994; Haensel 2003; Lattimer & Prakash 2004, 2007). Al- 
though more modern values theoretically predict a max- 
imum NS mass of M max ss 2.2-2.9 M@ (Bombaci 1996; 
Kalogera & Baym 1996; Hcisclbcrg & Pandharipandc 
2000), it is still unclear whether very stiff equations of 
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states (EOSs) that stably sustain cores up to the general 
relativity limit (~ 3M©) can exist. 

Recent observations of pulsars in the Galactic plane 
as well as globular clusters suggest that there may be, 
in fact, NSs with masses significantly higher than the 
canonical value of 1.4 M© (e.g., Champion et al. 2008; 
Ransom et al. 2005; Freire 2008; Freire et al. 2008a,b). 
NSs in X-ray binaries also show systematic deviations 
from the canonical mass limit (e.g., van Kerkwijk et al. 
1995; Barziv ct al. 2001; Quaintrell et al. 2003; van dcr 
Meer et al. 2005; Ozel et al. 2009; Giiver et al. 2010). 

The most precise measurements of NS masses are 
achieved by estimating the rclativistic effects to orbital 
motion in binary systems. The exquisite precision of 
these mass measurements presents also a unique means 
to test general relativity in the "strong field" regime (e.g., 
Damour & Taylor 1992; Psaltis 2008). As the masses of 
NSs also retain information about the past value of the 
effective gravitational constant G, with the determina- 
tion of the NS mass range it may be even possible to 
probe the potential evolution of such physical constants 
(Thorsett 1996). 

A comprehensive insight into the underlying mass dis- 
tribution of NSs thus provides not only the means to 
study NS specific problems. It also offers diverse sets of 
constraints that can be as broad as the high-mass star 
formation history of the Galaxy (Gould 2000), or as par- 
ticular as the compression modulus of symmetric nuclear 
matter (Glcndenning 1986; Lattimcr et al. 1990). 

The present work aims to set up a framework by which 
we can probe the underlying mass distributions implied 
by radio pulsar observations. We take a comparative ap- 
proach and analyze the results that we obtain through 
both conventional (maximum likelihood estimation) and 
modern (Bayesian) statistical methods. Unlike conven- 
tional statistical methods, with a Bayesian approach it 
is possible to separately infer peaks, shapes and cutoff 
values of the distribution with appropriate uncertainty 
quantification. This gives us unique leverage to probe 
these parameters which separately trace independent as- 
trophysical and evolutionary processes. 

In order to prevent contamination of the population, 
which may lead to systematic deviations from the probed 
mass distribution, we keep the observed pulsar sample 
as uniform as possible. We choose mass measurements 
that do not have strong a priori model dependencies and 
therefore can be considered secure. 

In §2 we review theoretical constraints on NS masses. 
We derive useful quantities such as the NS birth mass 
Mbirth, the amount of mass expected to be transfered 
onto the NS primary during recycling Am acc , and the vi- 
able range of maximum mass cutoff value M max for NSs. 
The observations are reviewed in §3. The evolutionary 
paths that may produce double neutron star (DNS) and 
neutron star-white dwarf (NS-WD) systems are summa- 
rized in §4. We describe the statistical approach used to 
probe the underlying NS mass distribution in §5 and de- 
tail the process through which we test the performance 
in §6. After we summarize in §7, the range of implica- 
tions and following conclusions are discussed in §8. For 
brevity, the details of the algorithm and the analytical 
derivation of the numerical method for estimation are 
included as an appendix (§A and §B). 



2. THEORETICAL CONSTRAINTS 

2.1. Birth Mass 

The canonical mass limit M c h ~ 1.4 M© is the crit- 
ical mass beyond which the degenerate remnant core 
of a massive star or a white dwarf will lose gravita- 
tional stability and collapse into a NS. This limiting 
mass is an approximation which is sensitive to several nu- 
clear, rclativistic and geometric effects (sec Ghosh 2007; 
P. Haensel, A. Y. Potckhin, & D. G. Yakovlev 2007, for 
review). In addition to these effects, the variety of evo- 
lutionary processes that produce NSs warrant a careful 
treatment. 

A more precise parametrization of the Chandrasekhar 
mass is 

M cft = 5.83Y e 2 (1) 

where Y e = n p /(n p + n n ) is the electron fraction. A 
perfect neutron-proton equality (n p = n n ) with Y e = 

0. 50 yields a critical mass of 

M ch = 1.457 M©. (2) 

However, we have a sufficiently good insight into the pro- 
cesses that affect M c h- So, we can go beyond the ideal- 
ized cases and estimate the remnant's expected initial 
mass more realistically. 

The inclusion of more reasonable electron fractions 
(Y e < 0.50) yields smaller values for M c h- General rela- 
tivistic implications, surface boundary pressure correc- 
tions, and the reduction of pressure due to non-ideal 
Coulomb interactions (e~-e~ repulsion, ion-ion repulsion 
and e _ -ion attraction) at high densities all reduce the 
upper limit of M c h- 

On the other hand, the electrons of the progenitor (i.e., 
white dwarf or the core of a massive star) material are 
not completely relativistic. This reduces the pressure 
leading to an increase in the amount of mass required 
to reach the gravitational potential to collapse the star. 
Finite entropy corrections and the effects of rotation will 
also enhance the stability for additional mass. These 
corrections, as a result, yield a higher upper limit for 
M ch . 

The level of impact on the birth masses due to some 
of these competing effects is not well constrained as the 
details of the processes are not well understood. An in- 
clusion of the effects that are due to the diversity in the 
evolutionary processes alone requires a w 20% correction 
(for a detailed numerical treatment see Butterworth & 
Ipser 1975) and therefore implies a broader mass range, 

1. e. M ch ~ 1.17-1.75 M©. 

The measured masses, however, are the effective grav- 
itational masses rather than a measure of the baryonic 
mass content. After applying the quadratic correction 
term M baryon - M grav « 0.075 M* rav , we get 

M birt h ~ 1.08-1.57 M© (3) 
as a viable range for gravitational NS masses at birth. 

2.2. Accreted Mass 

There is considerable evidence that at least some mil- 
lisecond pulsars have evolved from a first generation of 
NSs which have accumulated mass and angular momen- 
tum from their evolved companion (Alpar ct al. 1982; 
Radhakrishnan & Srinivasan 1982; Wijnands & van der 
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Klis 1998; Markwardt et al. 2002; Galloway et al. 2002, 
2005). There is also a line of arguments that support 
the possibility of alternative evolutionary processes that 
may enrich the millisecond pulsar population (Bailyn & 
Grindlay 1990; Kiziltan & Thorsett 2009). 

Possible production channels for isolated millisecond 
pulsars are mergers of compact primaries or accretion in- 
duced collapse (AIC). In the case where a NS is produced 
via AIC, the final mass configuration of the remnant is 
determined by the central density of the progenitor (C-0 
or O-Ne white dwarf) and the speed at which the conduc- 
tive deflagration propagates (Wooslcy & Weaver 1992). 

While there are uncertainties for the parameters that 
describe the ignition and flame propagation, a careful 
treatment of the physics that tune the transition of an 
accreting white dwarf yields a unique baryonic mass 
^baryon ~ 1.39 M e for the remnant which gives a gravi- 
tational mass of Mgrav ~ 1-27 M Q for NSs produced via 
AIC (Timmes et al. 1996). There is indirect evidence 
that the occurrence rate of AICs can be significant (Bai- 
lyn & Grindlay 1990). 

The physics of these production channels are still not 
understood well enough to make quantitative predictions 
of the NS mass distribution produced via these processes. 
But we can estimate the mass required to spin NSs up 
to millisecond periods by using timescale and angular 
momentum arguments. 

For low mass X-ray binaries (LMXBs) accreting at 

typical rates of m ~ 10 _3 MEdd, the amount of mass 
accreted onto a NS in 10 10 yr is Am 0.10 M Q . We 
can also estimate the amount of angular momentum re- 
quired to spin the accreting progenitor up to velocities 
that equal the Keplerian velocity at the co-rotation ra- 
dius. In order to transfer sufficient angular momentum 
(L = I x lo) and spin up a normal pulsar (R «12 km, 
/ w 1.4 x 10 45 gem 2 ) to millisecond periods, an addi- 
tional mass of Am 0.20 M q is required. Hence, 

Am acc w 0.10-0.20 M Q (4) 

will be sufficient to recycle NS primaries into millisecond 
pulsars. 

2.3. Maximum Mass 

The mass and the composition of NSs are intricately 
related. One of the most important empirical clues that 
would lead to constraints on a wide range of physical 
processes is the maximum mass of NSs. For instance, 
secure constraints on the maximum mass provide insight 
into the range of viable EOSs for matter at supranuclear 
densities. 

A first order theoretical upper limit can be obtained by 
numerically integrating the Oppenheimer-Volkoff equa- 
tions for a low-density EOS at the lowest energy state 
of the nuclei (Baym et al. 1971). This yields an ex- 
treme upper bound to the maximum mass of a NS at 
M max ~ 3.2 M Q (Rhoades & Ruffini 1974). Any compact 
star to stably support masses beyond this limit requires 
stronger short-range repulsive nuclear forces that stiffens 
the EOSs beyond the causal limit. For cases in which 
causality is not a requisite (v — > oo) an upper limit still 
exist in general relativity ps 5.2 M q that considers uni- 
form density spheres (Shapiro & Teukolsky 1983). How- 
ever, for these cases the extremely stiff EOSs that require 



the sound speed to be super-luminal {dP/dp > c 2 ) are 
considered non-physical. 

Differentially rotating NSs that can support signifi- 
cantly more mass than uniform rotators can be tem- 
porarily produced by binary mergers (Baumgarte et al. 
2000). While differential rotation provides excess ra- 
dial stability against collapse, even for modest magnetic 
fields, magnetic braking and viscous forces will inevitably 
bring differentially rotating objects into uniform rotation 
(Shapiro 2000). Therefore, radio pulsars can be treated 
as uniform rotators when calculating the maximum NS 
mass. 

While general relativity along with the causal limit 
put a strict upper limit on the maximum NS mass at 
~ 3.2 Mq, the lower bound is mostly determined by the 
still unknown EOS of matter at these densities and there- 
fore is not well constrained. There arc modern EOSs 
with detailed inclusions of nuclear processes such as kaon 
condensation and nucleon-nucleon scattering which affect 
the stiffness. These EOSs give a range of 1.5-2.2 M Q as 
the lower bound for the maximum NS mass (Thorsson 
et al. 1994; Kalogera & Baym 1996). Although these 
lower bounds for a maximum NS mass are implied for 
a variation of more realistic EOSs, it is still unclear 
whether any of these values are favored. Therefore, 

M max ~ 1.5-3.2 M (5) 

can be considered a secure range for the maximum NS 
mass value. 

3. OBSERVATIONS 

The timing measurements of radio pulsations from NSs 
offer a precise means to constrain orbital parameters 
(Manchester & Taylor 1977). For systems where only five 
Keplerian orbital parameters (orbital period: P5, pro- 
jected semi-major axis: x, eccentricity: e, longitude and 
the time of periastron passage: uj , T ) are measured, in- 
dividual masses of the primary (mi) and secondary (17*2) 
stars, and the orbital inclination i cannot be separately 
constrained. They remain instead related by the mea- 
sured mass function / which is given by 

(m 2 sini)3 = /2vr\ 2 . 
J M 2 \P b J V ' 

where M = mi + m% and masses are in solar units, the 
constant T Q = GM Q /c 3 = 4.925490947/iS, and x is mea- 
sured in light seconds. 

For some binary systems, the timing residuals cannot 
be modeled with only Keplerian parameters when the ef- 
fects of general relativity are measurable. In these cases, 
the gravitational influence can be parametrized as five 
potentially measurable post-Keplerian (PK) parameters 
which have similar interpretations (Taylor 1992); (1) 6j: 

advance of periastron (2) Pt,: orbital period decay (3) 
7: time dilation-gravitational rcdshift (4) r: range of 
Shapiro delay (5) s: shape of Shapiro delay, where these 
are described by 

- = 3(g) _5/3 (T M) 2 / 3 (l- e 2 )- 1 , (7) 
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Table 1 

Double neutron star systems 



Pulsar 



Mass [Mq] 68% central limits Refs.' 



Double neutron star binaries 



J0737-3039 
pulsar A 
pulsar B 
total 


1.3381 
1.2489 
2.58708 


±0.0007 
±0.0007 
±0.00016 




[1] 


J1518+4904 
pulsar 
companion 
total 


1.56 
1.05 
2.61 


+0.13/ - 
+0.45/ - 
±0.070 


0.44 
0.11 


[2] 


B1534+12 
pulsar 
companion 
total 


1.3332 
1.3452 
2.678428 


±0.0010 
±0.0010 
±0.000018 


[3] 


J1756-2251 
pulsar 
companion 
total 


1.40 
1.18 
2.574 


+0.02/ - 
+0.03/ - 
±0.003 


0.03 
0.02 


[i] 


J1811-1736 
pulsar 
companion 
total 


1.56 
1.12 
2.57 


+0.24/ - 
+0.47/ - 
±0.10 


0.45 
0.13 


[5, 6] 


J1829+2456 
pulsar 
companion 
total 


1.20 
1.40 
2.59 


+0.12/ - 
+0.46/ - 
±0.02 


0.46 
0.12 


I'} 


J1906+0746 
pulsar 
companion 
total 


1.248 
1.365 
2.61 


±0.018 
±0.018 
±0.02 




[8, 9] 


B1913+16 
pulsar 
companion 
total 


1.4398 
1.3886 
2.82843 


±0.002 
±0.002 
±0.0002 




[10, 11] 


B2127+11C 
pulsar 
companion 
total 


1.358 
1.354 
2.71279 


±0.010 
±0.010 
±0.00013 




[12] 



a References: 1: Kramer et al. (2006), 2: Thorsett & 
Chakrabarty (1999), 3: Stairs et al. (2002), 4: Faulkner 
et al. (2005), 5: Stairs (2006), 6: Corongiu et al. (2007), 
7: Champion et al. (2005), 8: Kasian (2008), 9: Lorimer 
et al. (2006), 10: Weisberg et al. (2010), 11: Taylor (1992), 
12: Jacoby et al. (2006) 
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Table 2 

white dwarf binary systems 



Pulsar 



Mass [Mq] 68% central limits Refs. : 



Neutron star - white dwarf binaries 



J0437-4715 


1.76 


±0.20 






1 




J0621+1002 


1.70 


+0.10/ 


- 0.17 




2 




J0751+1807 


1.26 


±0.14 






2 




J1012+5307 


1.64 


±0.22 






3 




J1141-6545 


1.27 


±0.01 






1 




J1614-2230 


1.97 


±0.04 






5 




J1713+0747 


1.53 


+0.08/ 


- 0.06 




6 




J1802-2124 


1.24 


±0.11 






7 




B1855+09 


1.57 


+0.12/ 


- 0.11 




8 




J1909-3744 


1.438 


±0.024 






9 




B2303+46 


1.38 


+0.06/ 


- 0.10 


[10 





Neutron stars in globular clusters 



J0024-7204H 


1.48 


+0.03/ 


-0.06 




* 




J0514-4002A 


1.49 


+0.04/ 


- 0.27 




* 




B1516+02B 


2.10 


±0.19 






* 




J1748-2446I 


1.91 


+0.02/ 


- 0.10 




* 




J1748-2446J 


1.79 


+0.02/ 


- 0.10 




* 




B1802-07 


1.26 


+0.08/ 


- 0.17 


[10 




B1911-5958A 


1.40 


+0.16/ 


- 0.10 


[11 





a References: *: This work; Freire (personal communica- 
tion), 1: Verbiest et al. (2008), 2: Nice et al. (2008), 3: 
Callanan et al. (1998), 4: Bhat et al. (2008), 5: Demorest 
ct al. (2010) 6: Splaver et al. (2005), 7: Ferdman ct al. 
(2010), 8: Nice et al. (2003), 9: Jacoby et al. (2005), 10: 
Thorsett & Chakrabarty (1999), 11: Bassa ct al. (2006) 



A comprehensive review of the observational techniques 
and measurements can be found in Lorimer & Kramer 
(2004) and Stairs (2006). 

In systems where at least two PK parameters can be 
measured, mi and may be individually determined. 
In rare cases, more than two PK parameters are measur- 
able. These over-constrained systems present a unique 
means to test for consistent strong-field gravitational the- 
ories (Taylor & Weisberg 1989). 

In Table 1 and Table 2 we compile a comprehensive list 
of well-measured masses. We include the mass estimates 
along with the 68% confidence limits which are plotted 
on Figure 1. 

We aim to prevent possible contamination of the sam- 
ple with sub-populations which may have gone through 
different and not well understood evolutionary paths 
(e.g., isolated NSs). Even for the better constrained for- 
mation processes that lead to the production of DNS and 
NS-WD systems, theoretical models estimating the final 
NS masses are tentative. 



' = e ( £ ) T 2 J 3 M^ 3 m 2 (m, + 2m 2 ) , (9) 



1/3 



r = T Q m 2 , 



p b X-2/3 



(10) 



4. EVOLUTION 

While the production channels proposed for DNS and 
NS-WD systems are very diverse, the precise orbital pa- 
rameters derived from radio observations allow us to 
probe the viability of these models, and hence extract 
constraints on their mass evolution. We therefore limit 
our analysis to DNS and NS-WD systems for which 
testable evolutionary models and reliable mass measure- 
ments exist. 
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Figure 1. Measured masses of radio pulsars. All error bars 
indicate the central 68% confidence limits. Vertical solid lines 
are the peak values of the underlying mass distribution for DNS 
(m = 1.35 M ) and NS-WD (m = 1.50 M ) systems. The dashed 
and dotted vertical lines show the central 68% and 95% predictive 
probability intervals of the underlying mass distribution shown in 
Figure 2. "*" points to pulsars found in globular clusters. 



4.1. Double Neutron Star Systems 

Several scenarios have been suggested for the formation 
of DNS systems. Each NS in DNS systems is believed to 
originate from massive main sequence stars with masses 
that exceed 8M Q . While the formation sequence and 
processes are not well understood, the first formed NS 
produced by the more massive primary may initially ac- 
cumulate additional mass through wind accretion when 
the less massive secondary continues to undergo nuclear 
evolution during the early phases of the red giant branch. 
In the standard scenario, the system enters a high-mass 
X-ray binary (HMXB) phase in which unstable mass 
transfer leads to a common envelope evolution. The 
first formed NS is then expected to accumulate addi- 
tional mass during this phase and form pulsars such as 
B1913+16. A double-core model has also been suggested 
in which a He and a CO star evolves through a common 
envelope phase following the initial Roche-lobe overflow 
(Podsiadlowski et al. 2005). After the common envelope 
and a second phase of mass transfer, DNS systems such 
as J0737— 3039 can be produced following two consecu- 
tive SN explosions. Alternative evolutionary scenarios in 
which the progenitor of the secondary (less massive) pul- 
sar is a main sequence star with a mass less than 2 M Q 



have also been proposed as a viable production channel 
(Stairs et al. 2006). 

4.2. Neutron Star-White Dwarf Systems 

The evolutionary paths that may lead to the forma- 
tion of NS-WD systems include possible episodes of ac- 
cretion through wind, disk or a common envelope. Sec- 
ular disk accretion is generally accepted as the domi- 
nant process of mass transfer for long-period NS-WD sys- 
tems with low-mass white dwarf companions (e.g., PSR 
J1713+0747). On the other hand, NSs with more mas- 
sive white dwarf secondaries in close orbit systems are 
expected to go either primarily through a common enve- 
lope phase (e.g., PSR J1157+5112) or Roche-lobe over- 
flow followed by mass transfer through common enve- 
lope (e.g., PSR J1141-6545) (Stairs 2004). It's been also 
suggested that it may be possible to produce NS-WD bi- 
nary systems with orbital parameters that resemble PSR 
J2145— 0750 if the donor stars fill their Roche-lobe on the 
asymptotic giant branch (van den Heuvel 1994). 

5. ESTIMATING THE UNDERLYING MASS 
DISTRIBUTION 

Recent advances in statistical methods have reached a 
level which allows us to extract information from sparse 
data with unprecedented detail. Generally, there is an 
inverse correlation between the level of sophistication of 
the model and the confidence of the prediction. By dy- 
namically measuring the performance (see §6) one can 
choose an optimal level of detail to be implemented into 
the model. 

It can be clearly argued why modeling the underly- 
ing NS mass distribution as a single homogenous pop- 
ulation is over-simplistic. There is no compelling line 
of reasoning that would require a single coherent (uni- 
modal) mass distribution for NSs that we know have dis- 
similar evolutionary histories and possibly different pro- 
duction channels (e.g., see Podsiadlowski et al. 2004). 
In fact, there is an increasing number of measurements 
that show clear signatures for masses that deviate from 
the canonical value of 1.4 M©. For instance, recent find- 
ings of van Kerkwijk et al. (2010) imply that the mass for 
PSR B1957+20 may be as high as 2.4±0.12 M Q . Many of 
NSs in globular clusters also show systematically higher 
masses (see Freire et al. 2008b). Therefore, it is neces- 
sary to infer the implied mass distributions separately 
for different sub-populations (DNS vs. NS-WD). As we 
show in §6, an extensively tested and calibrated numeri- 
cal method can then be used to test whether the implied 
masses belong to the same distribution. We argue that 
with the number of secure mass measurements available 
(Table 1 and Table 2), clear signatures should be man- 
ifest in the inferred underlying mass distributions if ap- 
propriate statistical techniques are utilized. Since we still 
operate in the sparse data regime, it is useful, if not nec- 
essary, to use Bayesian inference methods. 

For the range of calculations we use mass measure- 
ments obtained directly from pulsar timing. The meth- 
ods used for estimating NS masses other than radio tim- 
ing, have intrinsically different systematics, and there- 
fore require a more careful treatment when assessing the 
implied NS mass distribution. The inclusion of mass es- 
timates of NSs in X-ray binaries along with these more 



6 



Kiziltan ct al. 



secure measurements would potentially perturb the ho- 
mogeneity of the sample and the coherence of the infer- 
ence. 

For an all inclusive assessment of NS masses, more 
sophisticated hierarchical inference methods may be re- 
quired. For sparse data, a proper statistical treatment of 
different systematic effects and a priori assumptions is 
not trivial. Also, the expected loss in precision may out- 
weigh the gain obtained from a more detailed approach. 
Without properly tested and calibrated tools, further in- 
clusion of NSs whose masses are not measured by pulsar 
timing may just contaminate the sample and can there- 
fore be misleading (e.g., see Steiner et al. 2010). 

5.1. Statistical Model 

Here, we present the statistical model used for estimat- 
ing the NS mass distribution. The approach is based on a 
formulation that incorporates measurement errors of NS 
mass estimates. Specifically, we perform our calculations 
for mass distributions characterized by 



M-i + Wi, i = 1, n 



(12) 



where m is the pulsar mass estimate and M. is the NS 
mass with an associated if error. We assume a normal 
NS mass distribution, N(A4; [i, ct 2 ), with mean // and 
variance ct 2 . The errors (w^, associated with the pul- 
sar mass estimates (m*) are assumed to arise from nor- 
mal distributions iV(0, Sf). The observation specific er- 
ror variances (Sf) are obtained from the error bands of 
pulsar observations (i.e., Table 1 and Table 2). Assum- 
ing independence between the normal distributions for 
M and w, the probability model described above yields 



N(m;n,a 2 + S 2 ) 



(13) 



distribution for the NS mass estimates. 

Therefore, the likelihood function for the NS mass dis- 
tribution parameters (/x,ct 2 ) is given by 



C(^i 1 ct 2 ; data) = 

n 

Y[[2n(a 2 + S 2 ) 



-1/2 



_(m i -M)f_ 



(14) 



(for derivation see §A). Here, the data vector comprises 
the observed mass estimates m^, and error variances Sf, 
which are computed using the estimated error bars for 
each m^, i = l,...,n. Numerical maximization of the 
likelihood function yields maximum likelihood estimates 
for the Gaussian mean fi and half-width ct implied by 
pulsar observations. 

However, the general non-standard fashion of how ct 2 
enters the expression for the likelihood function, the un- 
certainty quantification for the point estimates of /z and 
ct, and the subsequent effect on NS mass density esti- 
mates would require asymptotic results for likelihood- 
based confidence intervals. Given that the likelihood ap- 
proach relies on large sample sizes for uncertainty esti- 
mates, this can be especially problematic where the num- 
ber of mass estimates from DNS and NS-WD systems are 
small. 

We thus employ a Bayesian approach to modeling 
and inference of the NS mass distribution. Under 
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Figure 2. Posterior predictive density estimates for the neutron 
star mass distribution. DNS and NS-WD systems have respective 
peaks at 1.35 Mq and 1.50 Mq. Probability densities are normal- 
ized to show the 95% posterior probability range. The solid parts 
of the curves show the central 68% probability range which cor- 
respond to 1.35 ± 0.13 M Q and 1.50 ± 0.25 M for the DNS and 
NS-WD system, respectively. 

the Bayesian model formulation, the likelihood function 
C(/i, ct 2 ; data) is combined with (independent) prior dis- 
tributions 7r(/z) and 7t(ct 2 ) for [i and ct 2 to obtain the 
posterior distribution for the model parameters, given 
the data, 

p(^,a 2 | data) = C _1 7r(/i)7r(CT 2 )/:(^,CT 2 ;data). (15) 

We work with a normal prior for \i with mean a and vari- 
ance b 2 , and an inverse-gamma prior for ct 2 with mean 
d/(c — 1) for c > 1 (see §A for details). The normal- 
izing constant of the posterior distribution involves the 
marginal likelihood for the data, that is, 



(16) 



C = I 7t(/i)7t(ct ct ; data) d/ida 



The posterior density is not available in closed form, 
since the integral for the normalizing constant cannot 
be analytically evaluated. We therefore resort to a 
Markov chain Monte Carlo (MCMC) approach to sam- 
pling from the posterior distribution (see Gelman et al. 
2003). MCMC posterior inference is based on simula- 
tion from a Markov chain whose stationary distribution is 
given by the posterior distribution for the model param- 
eters. As detailed in §A, the MCMC algorithm samples 
dynamically from the posterior full conditional distribu- 
tions for [i and ct 2 . The former is a normal distribution 
and hence readily sampled; the latter is not of a stan- 
dard form and thus a Metropolis-Hastings (M-H) step is 
used to sample from the conditional posterior distribu- 
tion of ct 2 . The resulting posterior samples for (n,cr 2 ) 
can be used for full and exact inference for the model 
parameters [i and ct 2 . More importantly, the posteriors 
for (/i, ct 2 ) are used to infer the NS mass distribution. 

In a Bayesian approach, the posterior predictive den- 
sity, denoted by V(M.<z | data), provides the estimate 
for the density of the NS mass distribution. Formally, 
V(Mo | data) is the distribution for a "new" unobserved 
pulsar with unknown mass A4q, which we seek to esti- 
mate (predict) given the observed data. Following the 
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Figure 3. Likelihood surfaces (a) and posterior densities (b) of 
model parameters fi and a for the NS mass distribution. In each 
panel, the tight contours on the left and wider contours on the 
right correspond to the data from DNS and NS-WD systems, re- 
spectively. 

derivation in §B, the posterior predictive density can be 
derived as 

P(M I data) = 

N(Mo;/i,a 2 )p(fi,a 2 | data) dfi da 2 



(17) 



and can thus be readily estimated using the MCMC sam- 
ples from the posterior distribution p(p,a 2 | data). Fig- 
ure 2 shows the inferred posterior predictive NS mass 
densities for the DNS and NS-WD systems. The 68% 
and 95% predictive probability intervals inferred from 
Figure 2 are projected onto Figure 1 as vertical lines. 

5.2. Comparison with Likelihood Estimation 

We have taken a comparative approach to probe the 
underlying mass distribution of NSs. In particular, in 
order to assess the sensitivity of the results to the tech- 
nique that is used for inference, we studied the shape of 
the posterior density p(fi, a 2 | data) under different pri- 
ors, ir(p) and 7r(<7 2 ), relative to the likelihood function 
for 0,tr 2 )._ 

The likelihood surface contours for the Gaussian mean 
p and half- width a arc shown in Figure 3(a) for DNS and 
NS-WD systems. Figure 3(b) plots the corresponding 
contours of the posterior density under priors (2) and (3) 
in Figure 4 for DNS and NS-WD systems, respectively. 

It is a good indication for the model and method ac- 
ceptability if the compared parameter estimates are not 
drastically different. Indeed, with regard to the implied 
estimates for fj, and er, the difference between the maxi- 
mum likelihood and the Baycsian approach appears prac- 
tically insignificant. Section 6 includes a more detailed 
study of the effect of the prior choice on posterior infer- 
ence results under the proposed Baycsian model. 

6. PRIOR CHOICE AND ALGORITHM 
PERFORMANCE 

6.1. Approach to Prior Choice 
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Figure 4. Plot of the prior predictive NS mass densities under 
four different prior choices tested for performance. Each prior is 
defined by hyper-parameters (a,b,c,d) (see Equation Al and Equa- 
tion A2). 



In a Baycsian framework, weakly informative priors 
are desirable for sparse data. Non-informative priors can 
have strong and undesirable implications that lead to ar- 
tificially biased inferences. Weakly informative priors, 
however, will let the data tune the posterior more opti- 
mally while being strong enough to exclude various "un- 
physical" possibilities (Gelman et al. 2003; Robert 2007). 

In order to prevent introducing cither a strongly infor- 
mative bias or loss of information, we test a vast range 
of priors on simulated samples. The most natural way to 
study the effect of the prior choice is through the prior 
predictive density, denoted by V(Aio), which yields the 
prior estimate of the density for the NS mass distribution, 
that is, before data is used. Analogously to the expres- 
sion for the posterior predictive density (see Equation 
17), the prior predictive density is defined by 



r(Mo) 



N(M ; p, er 2 ) ir(p)ir(a 2 ) d[i da 2 



Hence, V(A4q) can be estimated by Monte Carlo integra- 
tion of the N(Mq; p, a 2 ) density, using samples from the 
prior distributions. A series of results are produced for 
priors (for p and a 2 ) that we incrementally tune to ob- 
tain prior predictive densities with dispersions that range 
from fairly non-informative (practically flat) to very con- 
centrated shapes. A sample range is shown in Figure 4 for 
four combinations of hyper-parameters (a, b) and (c, d) 
that define the prior for /i and cr 2 , respectively. We then 
use these priors to test the effects of the choice on the 
posterior distribution. 

The shape of a "weakly informative" prior will dynam- 
ically change with the data sample that is used for infer- 
ence. For instance, for very tightly bound data sets such 
as DNS systems, a sharply peaked prior can qualify as 
weakly informative, while the same prior will certainly 
be more informative for a more dispersed data set such 
as NS-WD systems. Our goal is to quantitatively find an 
optimum choice rather than qualitatively assign priors. 

Once we find optimum priors for DNS and NS-WD 
systems (see also §6.2), we use Monte-Carlo simulations 
to numerically estimate the accuracy level we can reach 
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Figure 5. Performance of the inference algorithm for (a) NS-WD and (b) DNS systems. A pea k is the fractional percentage deviation of 
the inferred value from the real peak. The optimum choice of prior is determined by minimizing x 2 and the fractional mean deviation from 
the input value. Priors (2) and (3) from Figure 4 perform best for DNS and NS-WD systems respectively. 
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Figure 6. Performance of the inference algorithm for (a) NS-WD and (b) DNS systems. A m ax is the fractional percentage deviation of 
the inferred value from the real maximum cut-off. The optimum choice of prior is determined by minimizing \ 2 an d the fractional mean 
deviation from the input value. Priors (2) and (3) from Figure 4 perform best for DNS and NS-WD systems respectively. 



with our approach. 

6.2. Accuracy and Error Estimation 

In order to test the performance of the algorithm that 
we use to infer the underlying NS mass distribution, we 
run a series (i.e. 10 5 ) of simulations. For each step, we 
construct distributions that have random peaks between 
fi =[0,3] Mq with associated random variances between 
a =[0,0.3]. From these distributions, we randomly select 
the same number of samples as our "observed" sample. 
Next, we artificially corrupt these simulated samples by 



introducing random errors consistent with uncertainties 
in observations from the real data. We then use these 
smeared samples to test whether we can recover the "in- 
put /known" distribution. 

To quantify performance we calculate the variance of 
two quantities. For each realization we compare the 
peaks and the maximum cutoff values of the input dis- 
tribution and that of the predicted posterior we obtain 
with our algorithm. 

Figure 5 shows the fractional deviation from the input 
peak as a function of the intrinsic variance. The per- 
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formance of the inference algorithm is quantified for NS- 
WD and DNS systems separately in Figure 5(a) and Fig- 
ure 5(b), respectively. The four panels illustrate the per- 
formance for the priors shown in Figure 4. The numbers 
1-4 on the upper right corner of each panel of Figure 5 re- 
fer to priors which incrementally widen into almost a flat 
uninformative prior (see Figure 4 for the corresponding 
hyper-parameters) . 

The performance is measured by the absolute value of 
A which is the fractional difference between the "input" 
and "inferred" value. The behavior of the variance x 2 
is also used in conjunction as a measure of performance 
consistency. 

We repeat the same procedure in Figure 6 to quantify 
performance for recovering the maximum cutoff value. 

Clearly shown by varying performances in Figure 5 and 
Figure 6, the priors that can be used to reliably infer 
the underlying distribution for DNS and NS-WD systems 
have to bear different characteristics. To infer the "peak" 
and the "maximum cutoff value" for DNS and NS-WD 
systems, priors (2) and (3) in Figure 4 perform best, 
respectively, by producing the most accurate inferences. 

For simulated tight distributions similar to DNS sys- 
tems, the mean accuracy of the inferred peak value is 
better than 99%. For wider distributions resembling NS- 
WD systems, the predicted peak has > 98% accuracy. 
Both for DNS and NS-WD systems, we find that the 
underlying real maximum cutoff value cannot be larger 
than 10% of the inferred value. 

We also run a series of simulations in order to examine 
whether the algorithm can detect distributions that have 
been artificially skewed at varying levels and directions. 
Even at modest levels of input skewness, we find that 
the inferred shape (whether it exhibits skewness or not) 
is consistent with the underlying distribution with more 
than 99% confidence (see §8.2 and §8.3 for discussion). 

6.3. MCMC Algorithm Performance 

The technical core of Bayesian inference for models 
with analytically intractable posterior distributions is to 
use MCMC based algorithms. It is important to con- 
duct detailed performance tests for the MCMC method 
used for inference in order to assess probabilistic biases 
and sources of sensitivity. Especially for sparse data, 
the probability distribution function (pdf), and hence 
the inferred confidence contours can be very sensitive 
to sampling steps. For instance, if the sampling accep- 
tance rates for the M-H steps are not closely monitored 
and the inference process is not tuned for optimum sam- 
pling, over- or under-sampling, especially from the tails 
of the posterior distribution, will bias the inference. For 
sparse data, the introduced bias can potentially be very 
significant. 

It has been shown that for univariate target distri- 
butions the optimum acceptance rates for M-H based 
MCMC algorithms lie between 45-55%, which may go 
down to as much as 23% for multivariate target distri- 
butions (Roberts et al. 1997). For univariate distribu- 
tions, acceptance rates higher than 60% are indicative 
of over-sampling. Therefore, artificially high acceptance 
rates will result in underestimating the width of the dis- 
tribution. On the other hand, acceptance rates below 
30% will over estimate the spread of the confidence con- 
tours by under sampling the parameter space (Roberts 
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Figure 7. Autocorrelation plots for consecutive MCMC steps 
sampling the parameter space of the mean neutron star mass fi 
and the Gaussian half width <r. 



& Rosenthal 2001). We tune the M-H sampling steps to 
keep the acceptance rate in the optimum 45-55% range. 

Further quantitative insight into whether the MCMC 
sampler operates optimally can be gained by calculating 
the autocorrelation function for consecutively sampled 
model parameters. Each successive step will have a cer- 
tain level of correlation, since they are simulated from a 
Markov chain. The extent of autocorrelation is an indi- 
cation for the efficiency and, in general, performance of 
a particular MCMC algorithm. 

We calculate the autocorrelation function for both 
steps of the MCMC algorithm that sample the Gaussian 
mean /i and half-width a parameter. Figure 7 shows that 
the steps in the MCMC sampling scheme for both \i and 
a have very small autocorrelation, which suggests that 
the MCMC algorithm efficiently explores the posterior 
distribution for the model parameters. 

7. SUMMARY 

We overview the physical processes that tune masses 
of NSs in §2. In order to theoretically estimate the vi- 
able range for NS masses, we derive the birth mass (§2.1, 
Mforth = 1.08-1.57M Q ) and the amount of mass ex- 
pected to be transferred onto recycled NSs during the 
binary phase (§2.2, Am acc w 0.1-0.2 M Q ). We then 
discuss why the constraints on the maximum NS mass 
(M max = 1.5-3.2 Mq) are less stringent and comment 
on the sources of uncertainties in §2.3. 

In order to maintain a uniform approach in our anal- 
ysis, we refrained from including additional constraints 
that may arise from assumptions such as the possible 
relationship between the binary period and the mass of 
the remnant white dwarf (i.e., P& — ttt-2 relationship) sug- 
gested by Rappaport et al. (1995). While more elab- 
orate and hierarchical implementation methods may be 
utilized in deducing ramifications of other assumptions, a 
use of more inclusive approaches may only convolute the 
mass inference, which is contrary to the goal of this work. 
Throughout our analysis, we only assume that Einstein's 
prescription for general relativity is correct and include 
mass measurements which are considered secure (§3). 
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In §4 we point to the diversity of proposed evolutionary 
scenarios for DNS and NS-WD systems. Despite the lim- 
itations of constraints on their evolution, unlike isolated 
NSs, for pulsars in binary systems the precise measure- 
ments of the orbital parameters offer additional leverage 
to constrain the production channels. 

We then subject the pulsar mass measurements to a de- 
tailed statistical analysis. In §5 we show that a Bayesian 
method offers an effective means for inference. To alle- 
viate the subjective nature, we use Markovian decision 
making algorithms in choosing the priors that produce 
the most accurate prediction for each sub-population. 
After we calculate the underlying NS mass distribution 
through posterior predictive densities, in §6 we use a 
large sample of simulated and artificially corrupted data 
to test the performance of this approach and quantify 
uncertainty. The width of the underlying distribution of 
NSs shown in Figure 2 is then projected onto Figure 1 
for visual reference. 

8. DISCUSSION AND CONCLUSIONS 
8.1. Previous Studies 

The first article that reviewed pulsar mass mea- 
surements in order to deduce the range of masses 
NSs can attain, was published by Joss & Rappaport 
(1976). They used mass measurements from 5 sources 
(PSR B1913+16, Her X-l, Cen X-3, SMC X-l, and 
3U0900— 40), which were predominantly X-ray sources, 
and found a marginally consistent range of 1.4-1.8 M Q . 

Finn (1994) attempted to use Bayesian statistical tech- 
niques for the first time to infer limits on the NS mass 
distribution. By using the mass measurements of only 4 
radio pulsars (PSRs B1913+16, B1534+12, B2127+11C, 
and B2303+46), he concluded that NS masses should fall 
mainly in the range between 1.3-1.6 M©. The statistical 
approach he utilized did not, however, offer a means to 
measure the reliability and the predictive power. 

A comprehensive paper on pulsar masses was published 
by Thorsett & Chakrabarty (1999). Their analysis based 
on 26 sources yielded a remarkably tight mass range at 

1.38io'.io M ©- Tne width of their NS mass inference was 
mainly driven by the narrow error bands of the DNS 
mass measurements. 

The recent work by Schwab et al. (2010) suffers from 
other limitations. They analyze masses of 14 sources 
with an approach based on the comparison of the cu- 
mulative distribution function (CDF) with an idealized 
Gaussian. It is well understood that the K-S test should 
be used with caution in cases where deviations occur in 
the tails (Mason & Schuenemeyer 1983). Additionally, 
even in data samples where the number of outliers in 
the tails arc considerably larger and associated measure- 
ment errors are taken into account, a K-S approach will 
still remain inadequate in quantifying the significance of 
the outliers. Therefore, while the bimodal feature found 
for the initial mass function (i.e. Mbi rt h) may be con- 
sistent with theoretical expectations for remnant masses 
produced by electron-capture versus Fe-core collapse SNc 
(Podsiadlowski et al. 2004), the evidence for a deviation 
of Mforth from a unimodal distribution is still tentative. 
In order to firmly establish a potential multi-modal fea- 
ture for the NS birth mass distribution, a more diverse 
sample tested with more rigorous statistics are required. 



8.2. Statistical Approach to Infer Underlying 
Distributions 

While the statistical model we use and the inference 
method we develop (§5.1) are specifically tailored for pul- 
sar observations, it is detailed and modular enough that 
makes it a useful generic tool. There are several levels of 
challenges when adapting Bayesian methods: (1) There 
is a wide misconception on how priors should be chosen 
for Bayesian inferences and the level of subjectivity they 
inject into the prediction. (2) Once an appropriate prior 
is chosen, the Monte-Carlo method used for parameter 
estimation has to follow Markovian steps rather than just 
random sampling. (3) Then, in order to prevent over- or 
under-sampling, the process has to be tuned for optimal 
sampling. This can be achieved by closely monitoring the 
acceptance rates of the MCMC steps. (4) Another im- 
portant step for building a robust approach is to subject 
the algorithm to rigorous tests where simulated data with 
realistic errors and biases are used. (5) It is imperative 
to quantify the predictive power of the algorithm. Most 
convincingly, as demonstrated in Figure 5 and Figure 6, 
this can be accomplished by monitoring how much the 
inferred values deviate from simulated and corrupted in- 
put. In our case, we produce 10 5 distributions with ran- 
dom peaks between /i =[0,3.0] M© and variances between 
a =[0,0.3] which are then used for the testing process. 
This procedure yields independent confidence estimates 
for the peak and maximum cutoff values. 

The power of Bayesian inference is not in parameter 
estimation alone, but more in producing realistic pre- 
dictions (see Figure 2). As demonstrated in §6, the re- 
liability of Bayesian predictions can be quantified. We 
show that the method we use to infer the underlying NS 
mass distribution performs remarkably well. Even for 
limited samples, we achieve ~99% accuracy in predict- 
ing the peak and maximum cutoff values. In order to be 
conservative in our estimates, we run our testing proce- 
dure with 17 randomly chosen samples where, in fact, we 
have 18 well measured masses both for DNS and NS-WD 
systems. 

Another parameter that we are interested in is the po- 
tential skewness of the underlying NS mass distribution. 
A universal EOS that consistently describes the micro- 
physics of NS matter will induce a truncation limit on 
the underlying distribution. Consequently, such a trun- 
cation limit, if it exists, will define the transition region 
at the high mass end where NSs are expected to collapse 
into stellar mass black holes — this limit will delineate 
where stellar mass black holes form. 

8.3. Maximum Mass Limit 

We test whether the method is sensitive enough to de- 
tect signatures of a potential truncation, particularly at 
the high mass end of the underlying NS mass distribu- 
tion. We find that even in cases where a mild trunca- 
tion is imposed onto the input distribution, the algorithm 
produces results with ^99% consistency. Both predicted 
distributions for NSs in DNS and NS-WD systems in Fig- 
ure 2 are consistent with symmetric shapes (both with 
skewness parameter |7i| < 0.06), and show no signs of 
deviation in favor of a truncation on cither end. 

This has important ramifications: The lack of trun- 
cation indicates that, in particular, the high mass end 
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is driven by evolutionary constraints. Evolutionary pro- 
cesses such as long term stable accretion will naturally 
produce a wider distribution, while constraints due to 
general relativity or a universal EOS would produce a 
strict upper limit, which would manifest itself as a trun- 
cation limit. The lack of truncation in the shape of the 
underlying NS mass distribution, as a result, rules out 
the possibility that the upper mass limit is set by gen- 
eral relativity or the EOS. Therefore, the 2 M maximum 
mass limit implied by NS-WD systems should be consid- 
ered as a minimum secure limit to the maximum NS mass 
rather than an absolute upper limit to NS masses. 

8.4. Central Density and the Equation of State 

All EOSs that require a maximum NS mass M max < 
2M are ruled out. The implied stiffness of the EOS 
largely precludes the presence of meson condensates and 
hypcrons at supranuclear densities. Consequently, lower 
central densities, larger radii and thicker crusts for NSs 
are favored (Shapiro & Teukolsky 1983). 

The energy density-radius relation implied by Tolman 
(1939), when combined with the causality limit, gives 
an analytical solution for an upper limit on the central 
density 

p c M 2 = 15.3 x 1O 15 M gem" 3 . (18) 

With a 2 M secure lower limit on the maximum NS 
mass, we set a 95% confidence upper limit to the central 
density of NSs, which is 

p max <3.83x 10 15 gcm- 3 (19) 

corresponding to ~llp s for a fiducial saturation thresh- 
old n s ~ 0.16 fur 3 . 

Exotic matter such as hyperons and Bose condensates 
significantly reduce the maximum mass of NSs. There- 
fore, a strict lower limit on the maximum NS mass 
M max > 2 M rules out soft EOSs with extreme low den- 
sity softening and require the existence of exotic hadronic 
matter (see Lattimer & Prakash 2007, for review). NSs 
with deconfincd strange quark matter mostly have max- 
imum predicted masses lower than 2M . Hence, EOSs 
with strange quark matter that predict maximum masses 
smaller than 2 Mq can also be ruled out as viable config- 
urations for NS matter. 

8.5. Evidence for Alternative Evolution and the 

Formation of Massive Neutron Stars ? 

A 2 Mo upper limit to masses of NSs in NS-WD sys- 
tem poses a problem. If all millisecond pulsars were in- 
deed NSs that are recycled from a first generation of nor- 
mal pulsars, the implied distribution should be consistent 
with a recycled version of the initial mass distribution. 
While the peaks of the distributions for DNS and NS-WD 
systems are consistent with the expectations of standard 
recycling (§2.2), the widths imply otherwise. As shown 
in Figure 2, Am acc = 0.15 M lies perfectly within the 
expected range. However, with typical accretion rates ex- 
perienced during the LMXB phase (m acc ~ 10~ 3 MEdd), 
NSs with masses ~2 M such as PSR J1614-2230 can- 
not be formed. Even with initial masses of ~1.6 M these 
sources need to accrete Am ~ 0.4 M during their active 
accretion phase. This requires long term stable active ac- 
cretion at unusually high rates. 



Based on the P-P demographics of millisecond pulsars, 
Kiziltan & Thorsett (2009) argue that sa 30% of the mil- 
lisecond pulsar population may be produced via a non- 
standard evolutionary channel. This prediction falls in 
line with a distribution that has a consistent recycled 
peak but has an unusual width which extends up to 2 M . 
While it is difficult to quantify the formation rate(s) of 
non-standard processes that may produce these NSs, it 
is clear that the standard scenario requires at least a 
revision. Such a revision should consistently reconcile 
for the observed P-P distribution of millisecond pulsars, 
along with the long term sustainability of unusually high 
accretion rates that is required to produce the second 
generation of massive NSs. 

The only viable alternative to a major revision of the 
mass evolution implied by the standard recycling sce- 
nario, also corroborated by the lack of truncation of the 
underlying NS mass distribution, is then to form massive 
NSs. 
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APPENDIX 



A. INFERENCE UNDER THE BAYESIAN MODEL FOR THE NEUTRON STAR MASS 

DISTRIBUTION 

A.l. Model Formulation 

Based on the statistical model formulation for the NS mass distribution developed in Section 5.1, the likelihood for 
the data {(mi, Si) : i = l,...,n} is given by 



Ciji, a 2 ; data) = J| [2Tt(a 2 + S 2 )] 



-1/2 



(jm - m) 2 
((* 2 + S?) 



(14) 



The Bayesian model is completed with independent normal N(a, b 2 ) and inverse-gamma IG(c, d) priors for \i and 
tr 2 , respectively. Specifically, 



7t(m) = (27rfo 2 )- 1/2 exp 



(p a)' 



2b 2 



and 



n(a 2 ) 



d c exp(-^) 
r ( c ) CT 2(c+i) 



(Al) 



(A2) 



for fixed hyper-parameters (a, 6, c, d). Note that the prior mean for a 2 is given by dj (c — 1) (provided c > 1). 

Combining the likelihood with the priors for \i and tr 2 , the posterior distribution for the model parameters can be 
written proportional to 



p(/i,a 2 | data) cx 7r(^)7r(CT 2 )£(/i, cr 2 ; data) 

(fi-a) 2 ] exp(-£) 



cx exp 



2b 2 



r 2(c+l) 



II(- 2 + ^)- 1/2 exp -i£ 



(mi - n) 2 
^(- 2 + S 2 ) 



As discussed in Section 5.1, although the normalizing constant for the posterior density is not available in closed form, 
we can utilize MCMC sampling from p(/i,a 2 | data), which results in full inference of the model parameters, as well 
as the posterior prediction of the NS mass distribution. 

A. 2. MCMC Posterior Simulation Method 

The MCMC algorithm updates dynamically the two parameters, \x and a 2 , by sampling from their posterior full 
conditional distributions. After convergence, the resulting samples are realizations from the posterior distribution 
p(fx, a 2 | data). 

The posterior full conditional distribution for fi is proportional to 



p(fi | ex 2 , data) oc exp 



2b 2 



exp 



1 n 



(mj - fi) 2 
i(<r 2 + Sf) 



an expression which can be completed to a normal distribution with mean 



and variance 



b 2 = 



b 2 

i + ^ELi^ + s 2 )- 1 



However, the posterior full conditional for a 2 , 



(mi - fi) 2 



p(o- 2 | M ,data) ocpV) = {fi^ + $r 1 "} ^ {~\ E J^S?) j ' 

can not be recognized as a standard distributional form, and we thus use a Metropolis-Hasting (M-H) step to update 
a 2 given \i. For the M-H proposal distribution, we use the full conditional for a 2 from the special case of the model 
that does not include errors in measurement for the pulsar mass estimates (i.e. S 2 = 0, for all i). Under this simplified 
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version of the model, a 2 has an inverse-gamma IG(c, d) posterior full conditional distribution, where c = 0.5n + c and 

J=d + o.5£r=iK-A0 2 - 

Denoting by a 2 ^ the current state of the Markov chain for a 2 , the M-H step proceeds as follows: We draw a 
proposed value a 2 from the IG(c, d) distribution, and then compute the acceptance probability 

f P *(a 2 ) g(a 2 ^) 
g = min 1, ^\ t > x ' y 



P*(ct 2 W) g{a 2 ) 



where 



g(u) = d 



rgexp(-J/u) 



T(c> 



denotes the density of the IG(c,d) distribution, and as indicated above, p*(<J 2 ) is the density of the un-normalized 
posterior full conditional distribution for a 2 . Then, we obtain the new state of the chain for a 2 through 



a 2(t+i) 



a 2 with probability q 
cr 2 '*) with probability 1-q 



that is, a stochastic rejection step to determine whether the proposed value a 2 is accepted. 

Therefore, the MCMC method to sample from the posterior distribution p{fx, a 2 | data) involves overall the following 
iterative procedure: 

• Start with initial values O (0) , t7 2(0) ). 

• If the current iteration is (ii^\a 2 ^), obtain the next iteration (fj,( t+1 \ er 2 ( t+1 )) through the following two updates: 

— Draw from N(d, b 2 ), where 5, and b 2 are computed using a 2 = a 2 ^ . 

— Draw er 2( - t+1 ) using the M-H step, where both p*(-) and g(-) are evaluated using p, = fi^ t+1 \ 

B. POSTERIOR PREDICTIVE DISTRIBUTION 

The posterior samples from p(fi, a 2 | data) can be used to estimate the density of the NS mass distribution. The 
formal Bayesian estimate is given by the posterior predictive density V(Mo | data) (which is the posterior density 
given the observed data), for a "new" unobserved pulsar with unknown mass Mq. Hence, A4q is a parameter that we 
seek to estimate under the full Bayesian model that includes also the NS mass distribution parameters fi and a 2 . 

Based on the model for the observed mass estimates developed in Section 5.1, the augmented model that incorporates 
Mo involves two new terms: a N(A4o; [i, a 2 ) distribution for the unknown mass .A/Jo, and a N(wq;0, Sq) component 
for the error that would arise if we were to observe the pulsar that has a mass estimate mo associated with A4q. Now, 
this model can be marginalized over w to obtain the joint posterior distribution for A4q and (n,a 2 ), 

p(M a ,n,a 2 | data) oc N(Mq; /i, a 2 ) N(mf, /i, a 2 + S 2 )^j 7r(fi)n(a 2 ) = N(Mq; p, a 2 )p(pL, a 2 \ data). 

Finally, V{M.o \ data) is obtained by marginalizing the joint posterior p(M.$, /i, a 2 \ data) over fj, and a 2 : 

V(M a I data) = / / N(M a ; pi, a 2 ) p(fi, a 2 \ data) d/J, da 2 ( 17). 



Note therefore that the Bayesian estimate, V{M.o \ data), for the NS mass density incorporates uncertainty for 
parameters (/x,ct 2 ) by averaging over their posterior distribution. This can be contrasted with the predictive density 
under the likelihood approach, where the N{M.q] [i, a 2 ) NS mass density would simply be estimated by replacing the 
parameter vector (fi,a 2 ) with its maximum likelihood estimate. 

The posterior predictive density V(A4q \ data) can then be readily estimated through straightforward Monte Carlo 
integration suggested by Equation 17, using the MCMC samples from the posterior distribution p(/j,, o 2 \ data). 



